Should recognizers have ears?
Abstract
Techniques motivated by human auditory perception have recently been applied in mainstream speech technology, and there seems to be renewed interest in building more knowledge of human speech communication into the design of speech recognizers. The paper discusses the author's experience with applying auditory knowledge to automatic speech recognition (ASR). It advances the notion that the reason for applying such knowledge in speech engineering should be the ability of perception to suppress some of the irrelevant information in the speech message, and it argues against the blind implementation of scattered, accidental knowledge that may be irrelevant to the speech recognition task. Three properties of human speech perception are discussed in some detail: limited spectral resolution; the use of information from segments of about syllable length; and the ability to ignore corrupted or irrelevant components of speech. Referring to published work, the paper shows that selective use of auditory knowledge, optimized on and in some cases derived from real speech data, can be consistent with current stochastic approaches to ASR and could yield advantages in practical engineering applications. © 1998 Elsevier Science B.V. All rights reserved.
Similar resources
Using Articulatory Knowledge in Automatic Speech Recognition
Over the years different types of speech recognizers have been proposed and tested. For the last decade, or maybe even longer, hidden Markov models (HMMs) have seemed to perform better than other types of speech recognizers, such as rule-based ones. This state of affairs has led to a gap between speech technology on the one hand, and phonetics and phonology on the other. ...
ON GENERAL FUZZY RECOGNIZERS
In this paper, we define the concepts of general fuzzy recognizer, language recognized by a general fuzzy recognizer, the accessible and the coaccessible parts of a general fuzzy recognizer and the reversal of a general fuzzy recognizer. Then we obtain the relationships between them and construct a topology and some hypergroups on a general fuzzy recognizer.
EARS: Electromyographical Automatic Recognition of Speech
In this paper, we present our research on automatic speech recognition of surface electromyographic signals that are generated by the human articulatory muscles. With parallel recorded audible speech and electromyographic signals, experiments are conducted to show the anticipatory behavior of electromyographic signals with respect to speech signals. Additionally, we demonstrate how to develop p...
Combining forward-based and backward-based decoders for improved speech recognition performance
Combining the outputs of speech recognizers is a known way of increasing speech recognition performance. The ROVER approach handles such combinations efficiently. In this paper we show that the best performance is not achieved by combining the outputs of the best set of recognizers, but rather by combining outputs of recognizers that rely on different processing components, and in particular on a d...
Polarimetric analysis of radar backscatter from ground-based scatterometers and wheat biomass monitoring with advanced synthetic aperture radar images
This article presents an analysis of the scattering measurements for an entire wheat growth cycle by ground-based scatterometers at a frequency of 5.3 GHz. Since wheat ears are related to wheat growth and yield, the radar backscatter of wheat was analyzed at two different periods, i.e., with and without wheat ears. Simultaneously, parameters such as wheat and soil characteristics as well as vol...
Journal title:
- Speech Communication
Volume 25
Publication date: 1998